pacman::p_load(sf, tidyverse, tmap, spNetwork, sp,spatstat ,dplyr)Takehome_Ex01
1 Setting the Scene
Road traffic accidents are a significant global issue, causing approximately 1.19 million deaths and leaving 20 to 50 million people with non-fatal injuries annually, according to the World Health Organization (WHO). Vulnerable road users, like pedestrians, cyclists, and motorcyclists, make up more than half of these fatalities, with most incidents occurring in low- and middle-income countries. Thailand, in particular, has one of the highest road traffic death rates in Southeast Asia, with around 20,000 deaths each year. Economic impacts are also substantial, costing countries about 3% of their GDP. In Thailand, national highways see the highest concentration of accidents, especially in straight road sections and other accident-prone zones like intersections and curves.
2 Objectives
3 Data Collection
3.1 Loading Packages
3.2 Data Collect
Three data sets are used in this exercise, they are:
Thailand Road Accident [2019-2022] on Kaggle. This dataset offers comprehensive statistics on road accidents recorded in Thailand from approximately 2019 to 2022, covering various aspects of the incidents.
Thailand Roads (OpenStreetMap Export) on HDX. This dataset, sourced from OpenStreetMap, provides a detailed map of Thailand’s road network.
Thailand - Subnational Administrative Boundaries on HDX. This dataset provides comprehensive geographic data on Thailand’s subnational administrative boundaries, covering provinces, districts, and subdistricts.
4 KDE analysis
In this part, we will conduct a kernel density analysis for the Bangkok Metropolitan Region (BMR) to simply visualize and identify areas with a higher concentration of car accidents.
4.1 Data Preparation
4.4.1 Data import
acc <- read.csv("data/aspatical/thai_road_accident_2019_2022.csv") tr <- st_read(dsn="data/geospatial", layer = 'hotosm_tha_roads_lines_shp')Reading layer `hotosm_tha_roads_lines_shp' from data source
`D:\FuWanqian\ISSS608-VAA\Takehome_Ex\Takehome_Ex01\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 2792361 features and 14 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 97.34457 ymin: 5.643645 xmax: 105.6528 ymax: 20.47168
Geodetic CRS: WGS 84
ad <- st_read(dsn="data/geospatial", layer = 'tha_admbnda_adm1_rtsd_20220121')Reading layer `tha_admbnda_adm1_rtsd_20220121' from data source
`D:\FuWanqian\ISSS608-VAA\Takehome_Ex\Takehome_Ex01\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 77 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 97.34336 ymin: 5.613038 xmax: 105.637 ymax: 20.46507
Geodetic CRS: WGS 84
Here we first import the administrative level 1 data, which includes the provincial boundaries of Thailand.
4.4.2 Data selection
bmr_acc<- acc %>%
filter(province_en %in% c("Bangkok", "Nakhon Pathom", "Pathum Thani",
"Nonthaburi", "Samut Prakan", "Samut Sakhon"))bmr_ad<- ad %>%
filter(ADM1_EN %in% c("Bangkok", "Nakhon Pathom", "Pathum Thani",
"Nonthaburi", "Samut Prakan", "Samut Sakhon"))Here we filter the BMR data from both the accident data and the provincial boundary data of Thailand.”
4.4.3 Drop missing value for accident data
missing_coords <- bmr_acc[is.na(bmr_acc$longitude) | is.na(bmr_acc$latitude), ]
missing_coords_count <- nrow(missing_coords)
missing_coords_count[1] 350
The number of missing coordinates is 350, so using below codes to drop missing value.
bmr_acc2 <- bmr_acc[!is.na(bmr_acc$longitude) & !is.na(bmr_acc$latitude), ]4.4.4 Transform data format
It is important for us to ensure that all the data are projected in same projection system, then here we transform BMR accident data and BMR boundary to EPSG:32647 (UTM Zone 47N).
bmr_acc_data <- st_as_sf(bmr_acc2, coords = c("longitude", "latitude"), crs = 4326)
bmr_acc_data_utm <- st_transform(bmr_acc_data, crs = 32647)
bmr_acc_data_utmSimple feature collection with 12986 features and 16 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 591277.5 ymin: 1486846 xmax: 710166.1 ymax: 1576520
Projected CRS: WGS 84 / UTM zone 47N
First 10 features:
acc_code incident_datetime report_datetime province_th province_en
1 571882 2019/1/1 2:25 2019/1/2 17:32 นครปฐม Nakhon Pathom
2 600001 2019/1/1 3:00 2019/1/5 10:33 นนทบุรี Nonthaburi
3 605043 2019/1/1 3:00 2019/3/29 8:22 สมุทรปราการ Samut Prakan
4 629691 2019/1/1 3:05 2019/1/1 3:05 กรุงเทพมหานคร Bangkok
5 571887 2019/1/1 4:30 2019/1/2 17:32 นครปฐม Nakhon Pathom
6 599234 2019/1/1 4:45 2019/1/2 8:28 สมุทรปราการ Samut Prakan
7 599990 2019/1/1 5:30 2019/1/5 10:35 สมุทรสาคร Samut Sakhon
8 612045 2019/1/1 5:30 2019/10/9 13:50 นนทบุรี Nonthaburi
9 629689 2019/1/1 5:42 2019/1/1 5:42 กรุงเทพมหานคร Bangkok
10 607046 2019/1/1 6:30 2019/5/15 13:24 ปทุมธานี Pathum Thani
agency
1 department of rural roads
2 department of highways
3 department of highways
4 expressway authority of thailand
5 department of rural roads
6 department of highways
7 department of highways
8 department of highways
9 expressway authority of thailand
10 department of highways
route vehicle_type
1 แยกทางหลวงหมายเลข 4 (กม.ที่ 51+920) - บ้านวัดละมุด motorcycle
2 คลองวัดแดง - บางบัวทอง private/passenger car
3 ราษฎร์บูรณะ - พระสมุทรเจดีย์ private/passenger car
4 บางพลี-สุขสวัสดิ์ other
5 แยกทางหลวงหมายเลข 3310 (กม.ที่ 10+700) - บ้านกลาง motorcycle
6 บางพลี - บางบ่อ motorcycle
7 แสมดำ - สะพานข้ามแม่น้ำท่าจีนฝั่งตะวันตก motorcycle
8 แยกพงษ์เพชร - สะพานพระนั่งเกล้า private/passenger car
9 บางนา-ชลบุรี other
10 ปทุมธานี - ท้ายเกาะ private/passenger car
presumed_cause
1 speeding
2 speeding
3 running red lights/traffic signals
4 other
5 speeding
6 driving under the influence of alcohol
7 speeding
8 cutting in closely by people/vehicles/animals
9 other
10 speeding
accident_type number_of_vehicles_involved
1 rollover/fallen on straight road 1
2 rollover/fallen on straight road 1
3 collision at intersection corner 2
4 other 1
5 rollover/fallen on straight road 1
6 rollover/fallen on straight road 1
7 rollover/fallen on straight road 1
8 rollover/fallen on straight road 1
9 other 1
10 rollover/fallen on straight road 1
number_of_fatalities number_of_injuries weather_condition road_description
1 0 2 clear straight road
2 0 1 clear straight road
3 0 0 clear other
4 0 1 clear other
5 0 1 clear straight road
6 1 0 clear straight road
7 1 0 clear straight road
8 0 0 clear straight road
9 0 0 clear straight road
10 0 1 clear straight road
slope_description geometry
1 no slope POINT (627012.3 1533381)
2 no slope POINT (655411.7 1531940)
3 other POINT (670903.3 1503427)
4 other POINT (674117.8 1525144)
5 no slope POINT (641347.8 1526237)
6 no slope POINT (693488.9 1502876)
7 no slope POINT (640469.3 1501125)
8 no slope POINT (661585.2 1533339)
9 other POINT (674117.8 1525144)
10 no slope POINT (664551.7 1554801)
Simple feature collection with 12986 features and 16 fields
Geometry type: POINT Dimension: XY
Bounding box: xmin: 591277.5 ymin: 1486846 xmax: 710166.1 ymax: 1576520
Projected CRS: WGS 84 / UTM zone 47N
bmr_ad_data_utm <- st_transform(bmr_ad, crs = 32647)
bmr_ad_data_utmSimple feature collection with 6 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 587893.5 ymin: 1484414 xmax: 712440.5 ymax: 1579076
Projected CRS: WGS 84 / UTM zone 47N
Shape_Leng Shape_Area ADM1_EN ADM1_TH ADM1_PCODE ADM1_REF
1 2.417227 0.13133873 Bangkok กรุงเทพมหานคร TH10 <NA>
2 1.695100 0.07926199 Samut Prakan สมุทรปราการ TH11 <NA>
3 1.251111 0.05323766 Nonthaburi นนทบุรี TH12 <NA>
4 1.884945 0.12698345 Pathum Thani ปทุมธานี TH13 <NA>
5 2.463030 0.17891420 Nakhon Pathom นครปฐม TH73 <NA>
6 1.566369 0.07155983 Samut Sakhon สมุทรสาคร TH74 <NA>
ADM1ALT1EN ADM1ALT2EN ADM1ALT1TH ADM1ALT2TH ADM0_EN ADM0_TH ADM0_PCODE
1 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
2 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
3 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
4 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
5 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
6 <NA> <NA> <NA> <NA> Thailand ประเทศไทย TH
date validOn validTo geometry
1 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((674339.8 15...
2 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((687139.8 15...
3 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((644817.9 15...
4 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((704086 1575...
5 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((631987.6 15...
6 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((641549.1 15...
Simple feature collection with 6 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 587893.5 ymin: 1484414 xmax: 712440.5 ymax: 1579076
Projected CRS: WGS 84 / UTM zone 47N
4.2 Mapping the data
plot(st_geometry(bmr_ad_data_utm), col = "lightgrey", border = "black", main = "BMR Accident Locations (Base R)")
plot(st_geometry(bmr_acc_data_utm), col = "black", pch = 19, cex = 0.1, add = TRUE) 
tm_shape(bmr_ad_data_utm) +
tm_borders(col = "black", lwd = 1) + borders
tm_shape(bmr_acc_data_utm) +
tm_dots(col = "black", size = 0.01) +
tm_layout(title = "BMR Accident Locations (tmap)")



The first plot shows the locations of traffic accidents in the Bangkok Metropolitan Region, with black dots concentrated in the city center and along major roads, especially near intersections and busy traffic areas. Accident density varies across different provinces, with hotspots mainly distributed along major roads.
4.3 Convert data to ppp format
4.3.1 sp format
bmr_acc_data_sp <- as_Spatial(bmr_acc_data_utm)
bmr_boundary_sp <- as_Spatial(bmr_ad_data_utm)4.3.2 ppp format
acc_coords <- coordinates(bmr_acc_data_sp)
bbox_values <- bbox(bmr_boundary_sp)bmr_window <- owin(xrange = c(bbox_values[1, 1], bbox_values[1, 2]),
yrange = c(bbox_values[2, 1], bbox_values[2, 2]))bmr_acc_ppp <- ppp(x = acc_coords[, 1], y = acc_coords[, 2], window = bmr_window)
plot(bmr_acc_ppp, main = "BMR Accident Locations", cex = 0.5)

4.4 Handling duplicate points
4.4.1 check duplicates
sum(multiplicity(bmr_acc_ppp) > 1)[1] 2293
4.4.2 Apply jitter to handle duplicates
bmr_acc_ppp_jit <- rjitter(bmr_acc_ppp, retry = TRUE, nsim = 1, drop = TRUE)
any(duplicated(bmr_acc_ppp_jit))[1] FALSE
4.5 Combine point events object and owin object
bmr_acc_ppp_final <- bmr_acc_ppp_jit[bmr_window]4.6 Kernel Density Estimation (KDE) Analysis
sigma_value <- 12000
kde_bmr_acc <- density(bmr_acc_ppp_final,
sigma = sigma_value,
edge = TRUE,
kernel = "gaussian") plot(kde_bmr_acc, main = "KDE of Accident Data")
plot(st_geometry(bmr_ad_data_utm), add = TRUE, border = "black", lwd = 2)
From the map, it is clear that central Bangkok and parts of Samut Prakan exhibit the highest density of car accidents, indicating that accidents are more concentrated in these urban areas. In contrast, the outer regions such as Nakhon Pathom, Pathum Thani, and Nonthaburi show much lower accident densities.
4.7 G Function
Null Hypothesis (H0): The spatial distribution of car accidents follows a completely spatially random (CSR) pattern, meaning that accidents occur independently and uniformly over the study area.
Alternative Hypothesis (H1): The spatial distribution of car accidents is not random and exhibits clustering or dispersion, implying that accidents are more likely to occur near each other or further apart than expected under CSR.
G_CK = Gest(bmr_acc_ppp_final, correction = "border")
plot(G_CK, xlim=c(0,500))
The plot shows that the observed G function (black line) rises more quickly than the expected Poisson function (red dashed line), indicating that car accidents are spatially clustered.
5 NKDE analysis
Based on the KDE analysis, we found that Bangkok has the highest concentration of car accidents. Therefore, we refine the NKDE (Network Kernel Density Estimation) analysis to focus specifically on Bangkok’s road network, allowing for a more precise identification of accident hotspots along key roads and intersections.
5.1 Data select
Lots of side roads in the road data may influence the accuracy of the NKDE results, so here we filter the dataset to include only major roads.
main_rd <- tr %>%
filter(highway %in% c("motorway", "trunk", "primary", "secondary"))
saveRDS(main_rd,"data/geospatial/main_rd.rds")Here we select bangkok city boundary.
selected_cities <- c("Bangkok")
bangkok_city <- bmr_ad_data_utm %>%
filter(ADM1_EN %in% selected_cities)
saveRDS(bangkok_city,"data/geospatial/bmr_city.rds")Then we select the main roads within Bangkok by intersecting the road and city boundary datasets.
main_rd_format <- st_transform(main_rd, crs = 32647)
saveRDS(main_rd_format,"data/geospatial/main_rd_format.rds")main_bangkok_roads <- st_intersection(main_rd_format, bangkok_city)saveRDS(main_bangkok_roads, "data/geospatial/main_bangkok_roads2.rds")main_bangkok_roads <- readRDS("data/geospatial/main_bangkok_roads2.rds")Filters the accident data to include only records from the Bangkok province.
bangkok_acc <- bmr_acc_data_utm %>%
filter(province_en %in% "Bangkok")5.2 Transfer to linestring
main_bangkok_roads <- main_bangkok_roads %>%
filter(st_geometry_type(main_bangkok_roads) %in% c("LINESTRING", "MULTILINESTRING"))
main_bangkok_roads <- st_cast(main_bangkok_roads, "LINESTRING", group_or_split = TRUE)5.3 Generate lixels
lixels_bangkok <- lixelize_lines(main_bangkok_roads,
10000,
mindist = 5000)
samples_bangkok <- lines_center(lixels_bangkok)5.4 Perform NKDE
nkde_result_bangkok <- nkde(
lines = lixels_bangkok,
events = bangkok_acc,
w = rep(1, nrow(bangkok_acc)),
samples = samples_bangkok,
kernel_name = "quartic",
bw = 500,
div = "bw",
method = "simple",
grid_shape = c(200, 200),
verbose = TRUE
)nkde_result_bangkok <- readRDS("data/aspatical/nkde_result_bangkok.rds")
head(nkde_result_bangkok, 10) [1] 0.000000e+00 0.000000e+00 0.000000e+00 2.435628e-06 0.000000e+00
[6] 0.000000e+00 3.205377e-06 0.000000e+00 2.432477e-06 0.000000e+00
5.5 Visualize NKDE results
samples_bangkok$density <- nkde_result_bangkok
lixels_bangkok$density <- nkde_result_bangkoksamples_bangkok$density <- samples_bangkok$density * 10000
lixels_bangkok$density <- lixels_bangkok$density * 10000tmap_mode('view')
tm_shape(lixels_bangkok) +
tm_lines(col = "density", palette = "YlOrRd", title.col = "Density", lwd = 2) +
tm_shape(bangkok_acc) +
tm_dots(size = 0.1, col = "blue", alpha = 0.5, title = "Accidents")